The quantum relative entropy is frequently used as a distance, or distinguishability measure between two 
quantum states. In this paper we study the relation between this measure and a number of other measures used 
for that purpose, including the trace norm distance. More specifically, we derive lower and upper bounds on the 
relative entropy in terms of various distance measures for the difference of the states based on unitarily invariant 
norms. The upper bounds can be considered as statements of continuity of the relative entropy distance in the 
sense of Fannes. We employ methods from optimisation theory to obtain bounds that are as sharp as possible. 

I. INTRODUCTION 

The relative entropy of states of quantum systems is a measure of how well one quantum state
can be operationally distinguished from another. Defined as

$$S(\rho\|\sigma) = \mathrm{Tr}[\rho(\log\rho - \log\sigma)]$$

for states $\rho$ and $\sigma$, it quantifies the extent to which one hypothesis $\rho$ differs
from an alternative hypothesis $\sigma$ in the sense of quantum hypothesis testing. Dating back
to work by Umegaki [21], the relative entropy is a quantum generalisation of the
Kullback-Leibler relative entropy for probability distributions in mathematical statistics [14].
The quantum relative entropy plays an important role in quantum statistical mechanics [25] and
in quantum information theory, where it appears as a central notion in the study of capacities
of quantum channels [12, 19, 20, 22] and in entanglement theory [23].
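As a concrete reading of the definition above, the following minimal sketch evaluates $S(\rho\|\sigma)$ numerically. It assumes that both arguments are given as density matrices and that $\sigma$ is non-singular; the function name `relative_entropy` and the cut-off `1e-15` used for the convention $0\log 0 = 0$ are illustrative choices, not part of the paper.

```python
import numpy as np

def relative_entropy(rho, sigma):
    """S(rho||sigma) = Tr[rho (log rho - log sigma)], natural logarithm.

    Assumes rho and sigma are density matrices and sigma is non-singular.
    """
    # Tr[rho log rho] from the spectrum of rho, with the convention 0 log 0 = 0.
    p = np.linalg.eigvalsh(rho)
    p = p[p > 1e-15]
    tr_rho_log_rho = float(np.sum(p * np.log(p)))
    # log(sigma) through the eigendecomposition of sigma.
    s, v = np.linalg.eigh(sigma)
    log_sigma = (v * np.log(s)) @ v.conj().T
    tr_rho_log_sigma = float(np.trace(rho @ log_sigma).real)
    return tr_rho_log_rho - tr_rho_log_sigma
```

For commuting (diagonal) arguments this reduces to the classical Kullback-Leibler relative entropy of the two diagonals, which gives a simple sanity check.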

In finite-dimensional Hilbert spaces, the relative entropy functional is manifestly continuous
[25]; see also footnotes [26], [27]. In particular, if $\{\sigma_n\}_n$ is a sequence of states
of fixed finite dimension satisfying

$$\lim_{n\to\infty} \|\sigma_n - \sigma\|_1 = \lim_{n\to\infty} \mathrm{Tr}|\sigma_n - \sigma| = 0 \qquad (1)$$

for a given state $\sigma$, then

$$\lim_{n\to\infty} S(\sigma_n\|\sigma) = 0.$$

In practical contexts, however, more precise estimates can be necessary, in particular in an
asymptotic setting. Consider a state $\rho$ on a Hilbert space $\mathcal{H}$, and a sequence
$\{\sigma_n\}_n$, where $\sigma_n$ is a state on $\mathcal{H}^{\otimes n}$, the n-fold tensor
product of $\mathcal{H}$. The sequence is said to asymptotically approximate $\rho$ if
$\sigma_n$ tends to $\rho^{\otimes n}$ for $n \to \infty$. More precisely, one typically
requires that

$$\lim_{n\to\infty} \|\sigma_n - \rho^{\otimes n}\|_1 = 0. \qquad (2)$$

Now, as an alternative to the trace norm distance, one can consider the use of the Bures
distance. The Bures distance $D$ is defined as

$$D(\rho_1,\rho_2) = 2\,(1 - F(\rho_1,\rho_2))^{1/2}$$

in terms of the Uhlmann fidelity

$$F(\rho_1,\rho_2) = \mathrm{Tr}\,(\rho_1^{1/2}\rho_2\,\rho_1^{1/2})^{1/2}.$$

Because of the inequalities [3]

$$1 - F(\rho_1,\rho_2) \le \mathrm{Tr}|\rho_1 - \rho_2| \le (1 - F^2(\rho_1,\rho_2))^{1/2}, \qquad (3)$$

the trace norm distance tends to zero if and only if the Bures distance tends to zero, which
shows that, for the purpose of state discrimination, both distances are essentially equivalent
and one can use whichever is most convenient.

A natural question that now immediately arises is whether the same statement is true for the
relative entropy. To find an answer, one would need inequalities like (3) connecting the
quantum relative entropy, used as a distance measure, to the trace norm distance, or similar
distance measures.

In this paper, we do just that: we find upper bounds on the relative entropy functional in
terms of various norm differences of the two states. As such, the presented bounds are very
much in the same spirit as Fannes' inequality, sharpening the notion of continuity for the von
Neumann entropy [7]. It should be noted here that one of the main stumbling blocks in this
undertaking is the well-known fact that the relative entropy is not a very good distance
measure, as it gives infinite distance between non-identical pure states. However, we will
present a satisfactory solution, based on using the minimal eigenvalue of the state that is the
second argument of the relative entropy. Apart from the topic of upper bounds, we also study
lower bounds on the relative entropy, giving a complete picture of the relation between norm
based distances and relative entropy.

We start in Section II with a short motivation of how this paper came about. Section III
contains the relevant notations, definitions and basic results that will be used in the rest of
the paper. In Section IV we discuss some properties of unitarily invariant (UI) norms that will
allow us to consider all UI norms in one go. The first upper bounds on the relative entropy
$S(\rho\|\sigma)$ are presented in Section V, one bound being quadratic in the trace norm
distance of $\rho$ and $\sigma$ and the other logarithmic in the minimal eigenvalue of
$\sigma$. Both bounds separately capture an essential behaviour of the relative entropy, and it
is argued that finding a single bound that captures both behaviours at once is not a trivial
undertaking. Nevertheless, we will succeed in doing this in Section VII by constructing upper
bounds that are as sharp as possible for given trace norm distance and minimal eigenvalue of
$\sigma$. In Section VI we use similar techniques to derive lower bounds that are as sharp as
possible. Finally, in Section IX, we come back to the issue of state discrimination mentioned
at the beginning.



II. BACKGROUND 

In Ref. [3] (Example 6.2.31, p. 279) we find the following upper bound on the relative entropy,
valid for all $\rho$ and for non-singular $\sigma$:

$$S(\rho\|\sigma) \le \frac{\|\rho - \sigma\|_\infty}{\lambda_{\min}(\sigma)}. \qquad (4)$$

This bound is linear in the operator norm distance between $\rho$ and $\sigma$ and has a $1/x$
dependence on $\lambda_{\min}(\sigma)$. For several purposes, such a bound is not necessarily
sharp enough. The impetus for the present paper was given by the observation that upper bounds
on the relative entropy sharper than (4) should be possible. Specifically, there should exist
bounds that

1. are quadratic in $\rho - \sigma$, and/or

2. depend on $\lambda_{\min}(\sigma)$ in a logarithmic way.

A simple argument shows that a logarithmic dependence on $\lambda_{\min}(\sigma)$ can be
achieved instead of a $1/x$ dependence. Note that
$0 \ge \log\sigma \ge \mathbb{1}\cdot\log\lambda_{\min}(\sigma)$. Thus,

$$\begin{aligned}
S(\rho\|\sigma) &= \mathrm{Tr}[\rho(\log\rho - \log\sigma)] \\
&\le -S(\rho) - \log\lambda_{\min}(\sigma) \\
&\le -\log\lambda_{\min}(\sigma). \qquad (5)
\end{aligned}$$
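The bound (5) is easy to probe numerically. The sketch below draws random full-rank states and records the smallest observed slack $-\log\lambda_{\min}(\sigma) - S(\rho\|\sigma)$. The random-state construction (normalised $GG^\dagger$ with Gaussian $G$), the helper names and the default parameters are assumptions made for illustration only.

```python
import numpy as np

def random_state(d, rng):
    # Normalised G G^dagger with Gaussian G: full rank with probability one.
    g = rng.normal(size=(d, d)) + 1j * rng.normal(size=(d, d))
    m = g @ g.conj().T
    return m / np.trace(m).real

def relative_entropy(rho, sigma):
    # S(rho||sigma) = Tr[rho (log rho - log sigma)], sigma non-singular.
    p = np.linalg.eigvalsh(rho)
    p = p[p > 1e-15]
    s, v = np.linalg.eigh(sigma)
    log_sigma = (v * np.log(s)) @ v.conj().T
    return float(np.sum(p * np.log(p)) - np.trace(rho @ log_sigma).real)

def smallest_slack(d=3, trials=50, seed=0):
    # Minimum over the sample of  -log(lambda_min(sigma)) - S(rho||sigma).
    rng = np.random.default_rng(seed)
    slacks = []
    for _ in range(trials):
        rho, sigma = random_state(d, rng), random_state(d, rng)
        log_bound = -np.log(np.linalg.eigvalsh(sigma)[0])
        slacks.append(log_bound - relative_entropy(rho, sigma))
    return min(slacks)
```

A non-negative return value is what (5) predicts; the slack equals $S(\rho) + \mathrm{Tr}[\rho(\log\sigma - \log\lambda_{\min}(\sigma))]$, so it only closes for pure $\rho$ supported on the minimal eigenspace of $\sigma$.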

Concerning the quadratic dependence on $\rho - \sigma$, we can put
$\rho = \sigma + \epsilon\Delta$, with $\mathrm{Tr}[\Delta] = 0$, and calculate the derivative

$$\lim_{\epsilon\to 0} S(\sigma + \epsilon\Delta\|\sigma)/\epsilon$$

and find that this turns out to be zero for any non-singular $\sigma$. Indeed, the gradient of
the relative entropy $S(\rho\|\sigma)$ with respect to $\rho$ is
$\mathbb{1} + \log\rho - \log\sigma$ (see Lemma 1). Hence, for $\rho = \sigma$ and
$\mathrm{Tr}[\Delta] = 0$,

$$\lim_{\epsilon\to 0} S(\sigma + \epsilon\Delta\|\sigma)/\epsilon = \mathrm{Tr}[\Delta(\mathbb{1} + \log\sigma - \log\sigma)] = 0.$$

This seems to imply that for small $\epsilon$, $S(\sigma + \epsilon\Delta\|\sigma)$ must at
least be quadratic in $\epsilon$, and, therefore, upper bounds might exist that indeed are
quadratic in $\epsilon$. Furthermore, Ref. [18] contains the following quadratic lower bound
(Th. 1.15)

$$S(\rho\|\sigma) \ge \frac{1}{2}\|\rho - \sigma\|_1^2. \qquad (6)$$

The rest of the paper will be devoted to finding firm evidence for these intuitions, by
exploring the relation between relative entropy and norm based distances, culminating in a
number of bounds that are the sharpest possible.
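The two intuitions of this section, the vanishing first derivative and the quadratic lower bound of Ref. [18] read as $S(\rho\|\sigma) \ge \frac{1}{2}\|\rho-\sigma\|_1^2$, can be illustrated with a small numerical experiment. The particular $\sigma$ and traceless Hermitian $\Delta$ below are arbitrary illustrative choices, not taken from the paper.

```python
import numpy as np

def relative_entropy(rho, sigma):
    # S(rho||sigma) = Tr[rho (log rho - log sigma)], sigma non-singular.
    p = np.linalg.eigvalsh(rho)
    p = p[p > 1e-15]
    s, v = np.linalg.eigh(sigma)
    log_sigma = (v * np.log(s)) @ v.conj().T
    return float(np.sum(p * np.log(p)) - np.trace(rho @ log_sigma).real)

# Illustrative data: a full-rank state and a Hermitian perturbation with zero trace.
sigma = np.diag([0.5, 0.3, 0.2])
delta = np.array([[0.2, 0.1, 0.0],
                  [0.1, -0.1, 0.05],
                  [0.0, 0.05, -0.1]])

def s_eps(eps):
    # Relative entropy along the line rho = sigma + eps * delta.
    return relative_entropy(sigma + eps * delta, sigma)

def quadratic_lower_bound(eps):
    # (1/2) ||rho - sigma||_1^2 for rho = sigma + eps * delta.
    t = float(np.sum(np.abs(np.linalg.eigvalsh(eps * delta))))
    return 0.5 * t ** 2
```

If the leading behaviour is quadratic, halving $\epsilon$ should divide `s_eps` by roughly four, for instance when comparing `s_eps(2e-3)` with `s_eps(1e-3)`.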



III. NOTATION

In this paper, we will use the following notations. We will use the standard vector and matrix
bases: $e^i$ is the vector with the i-th element equal to 1 and all other elements equal to 0,
and $e^{i,j}$ is the matrix with i,j element equal to 1 and all other elements 0. For any
diagonal matrix $A$, we write $A_i$ as a shorthand for $A_{i,i}$, and
$\mathrm{Diag}(a_1, a_2, \ldots)$ is the diagonal matrix with $a_i$ as diagonal elements. We
reserve two symbols for the following special matrices:

$$E := \mathrm{Diag}(1, 0, \ldots, 0) = e^{1,1} \qquad (7)$$

and

$$F := \mathrm{Diag}(1, -1, 0, \ldots, 0) = e^{1,1} - e^{2,2}. \qquad (8)$$

The positive semi-definite order is denoted using the $\ge$ sign: $A \ge B$ iff $A - B \ge 0$
(positive semi-definite).

The (quantum) relative entropy is denoted as
$S(\rho\|\sigma) = \mathrm{Tr}[\rho(\log\rho - \log\sigma)]$. All logarithms in this paper are
natural logarithms. When $\rho$ and $\sigma$ are both diagonal (i.e., when we encounter the
commutative, classical case) we use the shorthand

$$S((r_1, r_2, \ldots)\|(s_1, s_2, \ldots)) := S(\mathrm{Diag}(r_1, r_2, \ldots)\,\|\,\mathrm{Diag}(s_1, s_2, \ldots)).$$

Lemma 1 The gradient of the relative entropy $S(\rho\|\sigma)$ with respect to its first
argument $\rho$, being non-singular, is given by $\mathbb{1} + \log\rho - \log\sigma$.

Proof. The calculation of this derivative is straightforward. The classical entropy function
$x \mapsto h(x) := -x\log x$ is continuously differentiable on (0, 1), and therefore

$$\frac{d}{d\epsilon}\Big|_{\epsilon=0} S(\rho + \epsilon\Delta) = \mathrm{Tr}[\Delta\, h'(\rho)].$$

Since $S(\rho\|\sigma) = -S(\rho) - \mathrm{Tr}[\rho\log\sigma]$, we can write

$$\lim_{\epsilon\to 0} \big(S(\rho + \epsilon\Delta\|\sigma) - S(\rho\|\sigma)\big)/\epsilon = \mathrm{Tr}[\Delta(\mathbb{1} + \log\rho - \log\sigma)].$$

□

Finally, we recall a number of series expansions related to the logarithm, which are valid for
$-1 < y < 1$,

$$\log(1-y) = -\sum_{k=1}^{\infty} \frac{y^k}{k},$$

$$\log(1+y) + \log(1-y) = -\sum_{k=1}^{\infty} \frac{y^{2k}}{k},$$

$$\log(1+y) - \log(1-y) = 2\sum_{k=0}^{\infty} \frac{y^{2k+1}}{2k+1}.$$

We will make extensive use of these expansions.



IV. UNITARILY INVARIANT NORMS 

In this section we collect the main definitions and known results about unitarily invariant
norms along with a number of refinements that will prove to be very useful for the rest of the
paper.

A unitarily invariant norm (UI norm), denoted with $|||\cdot|||$, is a norm on square matrices
that satisfies the property

$$|||UAV||| = |||A||| \qquad (9)$$

for all $A$ and for unitary $U$, $V$ ([1], Section IV.2). Perhaps the most important property
of UI norms is that they only depend on the singular values of the matrix $A$. If $A$ is
positive semi-definite, then $|||A|||$ depends only on the eigenvalues of $A$.

A very important class of UI norms are the Ky Fan norms $\|\cdot\|_{(k)}$, which are defined as
follows: for any given square $n \times n$ matrix $A$, with singular values $s_i(A)$ (sorted in
non-increasing order) and $1 \le k \le n$, the k-Ky Fan norm is the sum of the $k$ largest
singular values of $A$:

$$\|A\|_{(k)} = \sum_{i=1}^{k} s_i(A).$$

Two special Ky Fan norms are the operator norm and the trace norm,

$$\|A\|_\infty = \|A\|_{(1)}, \qquad \|A\|_{\mathrm{Tr}} = \|A\|_1 = \|A\|_{(n)}. \qquad (10)$$
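The definition of the Ky Fan norms translates directly into a few lines of code. The sketch below is only an illustration (the name `ky_fan_norm` is an assumption); it relies on the fact that `numpy.linalg.svd` returns the singular values in non-increasing order.

```python
import numpy as np

def ky_fan_norm(a, k):
    """Sum of the k largest singular values of the square matrix a."""
    s = np.linalg.svd(a, compute_uv=False)  # sorted in non-increasing order
    return float(np.sum(s[:k]))
```

For an $n \times n$ matrix, `ky_fan_norm(a, 1)` is the operator norm and `ky_fan_norm(a, n)` is the trace norm, in line with (10); for $F = \mathrm{Diag}(1,-1,0,\ldots,0)$ the values are 1 for $k = 1$ and 2 for $k > 1$.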



The importance of the Ky Fan norms derives from their leading role in Ky Fan's Dominance
Theorem (Ref. [1], Theorem IV.2.2):

Theorem 1 (Ky Fan Dominance) Let $A$ and $B$ be any two $n \times n$ matrices. If $B$ majorises
$A$ in all the Ky Fan norms,

$$\|A\|_{(k)} \le \|B\|_{(k)},$$

for all $k = 1, 2, \ldots, n$, then it does so in all other UI norms as well,

$$|||A||| \le |||B|||.$$

From Ky Fan's Dominance Theorem follows the following well-known norm dominance statement.

Lemma 2 For any matrix $A$, and any unitarily invariant norm $|||\cdot|||$,

$$\|A\|_\infty \le \frac{|||A|||}{|||E|||} \le \|A\|_1.$$

Proof. We need to show that, for every $A$,

$$|||(\|A\|_\infty)E||| \le |||A||| \le |||(\|A\|_1)E|||$$

holds for every unitarily invariant norm. By Ky Fan's dominance theorem, we only need to show
this for the Ky Fan norms. All the Ky Fan norms of $E$ are 1, and

$$\|A\|_\infty = \|A\|_{(1)} \le \|A\|_{(k)} \le \|A\|_{(n)} = \|A\|_1$$

follows from the definition of the Ky Fan norms.

□



The main mathematical object featuring in this paper is not the state, but rather the
difference $\Delta$ of two states, $\Delta := \rho - \sigma$, and for that object a stronger
dominance result obtains. We first show that the largest norm difference between two states
occurs for orthogonal pure states. Indeed, by convexity of norms, $|||\rho - \sigma|||$ is
maximal in pure $\rho$ and $\sigma$. A simple calculation then reveals that, for any unitarily
invariant norm,

$$|||\,|\psi\rangle\langle\psi| - |\phi\rangle\langle\phi|\,||| = (1 - |\langle\psi|\phi\rangle|^2)^{1/2}\,|||F|||.$$

This achieves its maximal value $|||F|||$ for $\psi$ orthogonal to $\phi$, showing that it
makes sense to normalise a norm distance $|||\rho - \sigma|||$ by division by $|||F|||$. We
will call this a rescaled norm. We now have the following dominance result for rescaled norms
of differences of states:

Lemma 3 For any Hermitian $\Delta$, with $\mathrm{Tr}[\Delta] = 0$,

$$\|\Delta\|_\infty \le \frac{|||\Delta|||}{|||F|||} \le \frac{\|\Delta\|_1}{2}.$$

Note that equality can be obtained for any value of $|||\Delta|||$, by setting $\Delta = cF$.

Proof. We need to show, for all traceless Hermitian $\Delta$, that

$$|||(\|\Delta\|_\infty)F||| \le |||\Delta||| \le |||(\|\Delta\|_1/2)F||| \qquad (11)$$

holds for every unitarily invariant norm. Again by Ky Fan's dominance theorem, we only need to
do this for the Ky Fan norms $\|\cdot\|_{(k)}$. Since

$$\|F\|_{(k)} = \begin{cases} 1, & k = 1, \\ 2, & k > 1, \end{cases}$$

and

$$\|X\|_{(1)} \le \|X\|_{(k)} \le \|X\|_{(d)},$$

the inequalities (11) follow trivially for $k > 1$. The case $k = 1$ is covered by Lemma 4
below. □

Lemma 4 For any Hermitian $\Delta$, with $\mathrm{Tr}[\Delta] = 0$,

$$\|\Delta\|_1 \ge 2\|\Delta\|_\infty.$$

Proof. Let the Jordan decomposition of $\Delta$ be

$$\Delta = \Delta_+ - \Delta_-, \qquad (12)$$

with $\Delta_+, \Delta_- \ge 0$. Since $\mathrm{Tr}[\Delta] = 0$, clearly
$\mathrm{Tr}[\Delta_+] = \mathrm{Tr}[\Delta_-]$ holds. Thus,
$\|\Delta\|_1 = \mathrm{Tr}|\Delta| = \mathrm{Tr}[\Delta_+] + \mathrm{Tr}[\Delta_-] = 2\,\mathrm{Tr}[\Delta_+]$.
Also,

$$\|\Delta\|_\infty = \max(\|\Delta_+\|_\infty, \|\Delta_-\|_\infty). \qquad (13)$$

Hence, $\|\Delta\|_\infty \le \max(\|\Delta_+\|_1, \|\Delta_-\|_1) = \mathrm{Tr}[\Delta_+] = \|\Delta\|_1/2$.

□
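Lemma 4 can be checked on random samples. The following sketch generates traceless Hermitian matrices and evaluates both norms from the eigenvalues; the sampling scheme and function names are illustrative assumptions.

```python
import numpy as np

def random_traceless_hermitian(d, rng):
    g = rng.normal(size=(d, d)) + 1j * rng.normal(size=(d, d))
    h = (g + g.conj().T) / 2
    # Subtract the mean eigenvalue so that the trace vanishes.
    return h - (np.trace(h).real / d) * np.eye(d)

def trace_and_operator_norm(delta):
    # For Hermitian matrices the singular values are the absolute eigenvalues.
    lam = np.linalg.eigvalsh(delta)
    return float(np.sum(np.abs(lam))), float(np.max(np.abs(lam)))

def lemma4_min_slack(d=4, trials=200, seed=1):
    # Smallest observed value of ||Delta||_1 - 2 ||Delta||_inf over the sample.
    rng = np.random.default_rng(seed)
    slacks = []
    for _ in range(trials):
        t1, top = trace_and_operator_norm(random_traceless_hermitian(d, rng))
        slacks.append(t1 - 2.0 * top)
    return min(slacks)
```

Equality is attained by multiples of $F$, for which the positive and the negative part each consist of a single eigenvalue.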



In this paper, we will also be dealing with $\Delta = \rho - \sigma$ under the constraint
$\sigma \ge \beta\mathbb{1}$. Obviously we have

$$\beta \le 1/d.$$

We now show that under this constraint, any rescaled norm of $\Delta$ is upper bounded by
$1 - \beta$.

Lemma 5 For any state $\rho$, and states $\sigma$ such that $\sigma \ge \beta\mathbb{1}$,

$$T := |||\rho - \sigma|||/|||F||| \le 1 - \beta. \qquad (14)$$

Proof. We proceed by maximising $T$ under the constraint $\sigma \ge \beta\mathbb{1}$.
Convexity of norms yields that $T$ is maximal when $\rho$ and $\sigma$ are extremal, hence in
$\rho$ being a pure state $|\phi\rangle\langle\phi|$ and $\sigma$ being of the form

$$\sigma = \beta\mathbb{1} + (1 - \beta d)|\psi\rangle\langle\psi|. \qquad (15)$$

Fixing $\phi = e^1$, we need to maximise

$$|||\,e^{1,1} - \beta\mathbb{1} - (1 - \beta d)|\psi\rangle\langle\psi|\,|||$$

over all $\psi$. Put $\psi = (\cos\alpha, \sin\alpha, 0, \ldots, 0)^T$, then the eigenvalues of
the matrix are

$$\lambda_\pm = \Big((d-2)\beta \pm \big((\beta d)^2 + 4(1 - \beta d)\sin^2\alpha\big)^{1/2}\Big)/2$$

and $-\beta$ (with multiplicity $d - 2$). One finds that, for $d \ge 2$,
$\lambda_+, |\lambda_-| \ge \beta$ for any value of $\alpha$, and both $\lambda_+$ and
$|\lambda_-|$ are maximal for $\alpha = \pi/2$, as would be expected. The maximal Ky Fan norms
of this matrix are therefore

$$\|\cdot\|_{(1)} = \lambda_+ = 1 - \beta,$$
$$\|\cdot\|_{(k)} = \lambda_+ + |\lambda_-| + (k-2)\beta = (2 - \beta d) + (k-2)\beta,$$

for $k > 1$. Hence, for every Ky Fan norm, the maximum norm value is obtained for orthogonal
$\phi$ and $\psi$. By the Ky Fan dominance theorem, this must then hold for any UI norm. In case
of the trace norm, as well as of the operator norm, the rescaled value of the maximum is
$1 - \beta$. By Lemma 3 this must then be the maximal value for any rescaled norm. □

Remark. For the Schatten q-norm, $|||F||| = 2^{1/q}$. The largest value of $|||F|||$ is 2,
obtained for the trace norm, and the smallest value is 1, for the operator norm.
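For the trace norm, where $|||F||| = 2$, Lemma 5 states that $\|\rho - \sigma\|_1/2 \le 1 - \lambda_{\min}(\sigma)$. The sketch below evaluates the slack of this inequality; the helper names are assumptions, and the extremal pair used in the usage note is the one constructed in the proof ($\rho = e^{1,1}$ and $\sigma = \beta\mathbb{1} + (1 - \beta d)e^{2,2}$).

```python
import numpy as np

def random_state(d, rng):
    g = rng.normal(size=(d, d)) + 1j * rng.normal(size=(d, d))
    m = g @ g.conj().T
    return m / np.trace(m).real

def rescaled_trace_distance(rho, sigma):
    # ||rho - sigma||_1 / |||F|||, with |||F||| = 2 for the trace norm.
    lam = np.linalg.eigvalsh(rho - sigma)
    return 0.5 * float(np.sum(np.abs(lam)))

def lemma5_slack(rho, sigma):
    # (1 - beta) - T with beta taken as the minimal eigenvalue of sigma.
    beta = float(np.linalg.eigvalsh(sigma)[0])
    return (1.0 - beta) - rescaled_trace_distance(rho, sigma)
```

With $d = 3$ and $\beta = 0.1$, the extremal pair `rho = np.diag([1.0, 0.0, 0.0])`, `sigma = np.diag([0.1, 0.8, 0.1])` should give a slack of zero up to rounding, while generic random pairs give a strictly positive slack.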



V. SOME SIMPLE UPPER BOUNDS 

In this Section we present our first attempts at finding upper bounds that capture the
essential features of relative entropy. In Subsection A we present a bound that is indeed
quadratic in the trace norm distance, the existence of which was already hinted at in Section
II. Likewise, in Subsection B, we find a bound that is logarithmic in the minimal eigenvalue of
$\sigma$, again in accordance with previous intuition. Combining the two bounds into one that
has both of these features turns out to be not so easy. In fact, in Subsection C a number of
arguments are given that initially hinted at the impossibility of realising such a combined
bound. Nevertheless, we will succeed in finding a combined bound later on in the paper, by
using techniques from optimisation theory [2].



A. A quadratic upper bound 

Lemma 6 For any positive definite matrix $A$ and Hermitian $\Delta$ such that $A + \Delta$ is
positive definite,

$$\log(A + \Delta) - \log(A) \le \int_0^\infty dx\,(A + x)^{-1}\Delta(A + x)^{-1}.$$

Proof. Since the logarithm is strictly matrix concave [3], for all $t \in [0, 1]$:

$$\log((1 - t)A + tB) \ge (1 - t)\log(A) + t\log(B).$$

Setting $B = A + \Delta$ and rearranging terms then gives

$$\frac{\log(A + t\Delta) - \log(A)}{t} \ge \log(A + \Delta) - \log(A),$$

for all $t \in (0, 1]$. A fortiori, this holds in the limit for $t$ going to zero, and then the
left-hand side is just the Fréchet derivative of log at $A$ in the direction $\Delta$. □



This Lemma allows us to give a simple upper bound on $S(\sigma + \Delta\|\sigma)$. Note that if
$A \ge B$ then $\mathrm{Tr}[CA] \ge \mathrm{Tr}[CB]$ for any $C \ge 0$. Therefore, we arrive at

$$\begin{aligned}
S(\rho\|\sigma) &= \mathrm{Tr}[(\sigma + \Delta)(\log(\sigma + \Delta) - \log\sigma)] \\
&\le \int_0^\infty dx\,\mathrm{Tr}[(\sigma + \Delta)(\sigma + x)^{-1}\Delta(\sigma + x)^{-1}] \\
&= \int_0^\infty dx\,\mathrm{Tr}[(\sigma + x)^{-1}\sigma(\sigma + x)^{-1}\Delta]
 + \int_0^\infty dx\,\mathrm{Tr}[\Delta(\sigma + x)^{-1}\Delta(\sigma + x)^{-1}].
\end{aligned}$$

The first integral evaluates to $\mathrm{Tr}[\Delta]$, because

$$\int_0^\infty dx\,\frac{s}{(s + x)^2} = 1$$

for any $s > 0$, and therefore gives the value 0. The second integral can be evaluated most
easily in a basis in which $\sigma$ is diagonal. Denoting by $s_i$ the eigenvalues of $\sigma$,
we get

$$\begin{aligned}
\int_0^\infty dx\,\mathrm{Tr}[\Delta(\sigma + x)^{-1}\Delta(\sigma + x)^{-1}]
&= \sum_{i,j} \Delta_{ij}\Delta_{ji} \int_0^\infty dx\,(s_i + x)^{-1}(s_j + x)^{-1} \\
&= \sum_{i \ne j} \Delta_{ij}\Delta_{ji}\,\frac{\log s_i - \log s_j}{s_i - s_j}
 + \sum_i (\Delta_{ii})^2\,\frac{1}{s_i}. \qquad (16)
\end{aligned}$$

The coefficients of $\Delta_{ij}\Delta_{ji}$ are easily seen to be always positive, and
furthermore, bounded from above by $1/\lambda_{\min}(\sigma)$. Hence we get the upper bound

$$\int_0^\infty dx\,\mathrm{Tr}[\Delta(\sigma + x)^{-1}\Delta(\sigma + x)^{-1}] \le \frac{\mathrm{Tr}[\Delta^2]}{\lambda_{\min}(\sigma)},$$

yielding an upper bound on the relative entropy which is, indeed, quadratic in $\Delta$:

Theorem 2 For states $\rho$ and $\sigma$ with $\Delta = \rho - \sigma$, $T = \|\Delta\|_2$ and
$\beta = \lambda_{\min}(\sigma)$,

$$S(\rho\|\sigma) \le \frac{T^2}{\beta}. \qquad (17)$$


B. An upper bound that is logarithmic in the minimum eigenvalue of $\sigma$

We have already found a sharper bound than (4) concerning its dependence on
$\lambda_{\min}(\sigma)$. However, the bound (5) is not sharp at all concerning its dependence
on $\rho - \sigma$. A slight modification can greatly improve this. First note

$$|\mathrm{Tr}[\Delta\log\sigma]| \le \|\Delta\|_1 \cdot \|\log\sigma\|_\infty = \mathrm{Tr}|\Delta| \cdot |\log\lambda_{\min}(\sigma)|.$$

This inequality can be sharpened, since $\mathrm{Tr}[\Delta] = 0$ and $\sigma$ is a state. Let
$\Delta = \Delta_+ - \Delta_-$ be the Jordan decomposition of $\Delta$, then

$$|\mathrm{Tr}[\Delta\log\sigma]| \le \|\Delta_+\|_1 \cdot |\log\lambda_{\min}(\sigma)| \qquad (18)$$

and hence

$$|\mathrm{Tr}[\Delta\log\sigma]| \le \mathrm{Tr}|\Delta|/2 \cdot |\log\lambda_{\min}(\sigma)|.$$

Furthermore, we have Fannes' continuity of the von Neumann entropy [7],

$$|S(\sigma + \Delta) - S(\sigma)| \le T\log d + \min(-T\log T, 1/e),$$

where $d$ is the dimension of the underlying Hilbert space and $T := \mathrm{Tr}|\Delta|$.
Combining all this with

$$S(\sigma + \Delta\|\sigma) = -(S(\sigma + \Delta) - S(\sigma)) - \mathrm{Tr}[\Delta\log\sigma]$$

gives rise to the subsequent upper bound, logarithmic in the smallest eigenvalue of $\sigma$.

Theorem 3 For all states $\rho$ and $\sigma$ on a d-dimensional Hilbert space, with
$T = \|\rho - \sigma\|_1$ and $\beta = \lambda_{\min}(\sigma)$,

$$S(\rho\|\sigma) \le T\log d + \min(-T\log T, 1/e) - \frac{T\log\beta}{2}. \qquad (19)$$


C. A combination of two bounds?

The following question comes to mind almost automatically: can we combine the two bounds (17)
and (19) into a single bound that is both quadratic in $\Delta$ and logarithmic in
$\lambda_{\min}(\sigma)$? This would certainly be a very desirable feature for a good upper
bound. For instance, could it be true that

$$S(\rho\|\sigma) \le C \cdot \mathrm{Tr}[(\rho - \sigma)^2] \cdot |\log\lambda_{\min}(\sigma)|,$$

for some constant $C > 0$? Unfortunately, the answer to this first attempt is negative. In
fact, the proposed inequality is violated no matter how large the value of $C$.

Proposition 1 For any $r > 0$ there exist states $\sigma$ and $\rho$ such that

$$S(\rho\|\sigma) > r \cdot \mathrm{Tr}[(\rho - \sigma)^2] \cdot |\log\lambda_{\min}(\sigma)|. \qquad (20)$$

Proof. It suffices to consider the case that $\sigma$, $\rho$ are states acting on the Hilbert
space $\mathbb{C}^2$, and that $\sigma$ and $\rho$ commute. Hence, the statement must only be
shown for two probability distributions

$$P = (p, 1 - p), \qquad Q = (q, 1 - q).$$

Without loss of generality we can require $q$ to be in [0, 1/2]. Then, one has to show that for
any $r > 0$ there exist $p$, $q$ such that the $C^\infty$-function $f$, defined as

$$f(p, q, r) = r\big((p - q)^2 + ((1 - p) - (1 - q))^2\big)\,|\log(q)| - \big(p\log(p/q) + (1 - p)\log[(1 - p)/(1 - q)]\big),$$

assumes a negative value. Now, for any $r > 1$, fix a $q \in (0, 1/2)$ such that
$-4r(q\log q) < 1$. Clearly,

$$f(q, q, r) = 0, \qquad \frac{\partial f}{\partial p}\Big|_{p=q} = 0.$$

Then

$$\frac{\partial^2 f}{\partial p^2}\Big|_{p=q} = -\frac{1}{1 - q} - \frac{1}{q} - 4r\log(q) < -\frac{1}{q} - 4r\log(q) < 0.$$

This means that there exists an $\epsilon > 0$ such that $f(p, q, r) < 0$ for
$p \in (q, q + \epsilon]$, which in turn proves the validity of (20). □
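The argument behind Proposition 1 is constructive and can be followed numerically in the classical two-level case $P = (p, 1-p)$, $Q = (q, 1-q)$. The sketch below first picks $q \in (0, 1/2)$ with $-4r\,q\log q < 1$ by repeated halving and then shrinks $\epsilon = p - q$ until the inequality (20) holds. The starting values, the halving strategy and the function names are assumptions made for illustration; here $\mathrm{Tr}[(\rho-\sigma)^2] = 2(p-q)^2$ and $\lambda_{\min}(\sigma) = q$.

```python
import math

def binary_kl(p, q):
    # S((p, 1-p)||(q, 1-q)); log1p keeps accuracy when p is close to q.
    return (p * math.log1p((p - q) / q)
            + (1.0 - p) * math.log1p(-(p - q) / (1.0 - q)))

def violation_example(r, max_halvings=60):
    """Return (p, q, lhs, rhs) with lhs = S(P||Q) > rhs = r Tr[(P-Q)^2] |log q|."""
    q = 0.25
    while -4.0 * r * q * math.log(q) >= 1.0:
        q /= 2.0
    eps = 1e-2 * q
    for _ in range(max_halvings):
        p = q + eps
        lhs = binary_kl(p, q)
        rhs = r * 2.0 * (p - q) ** 2 * abs(math.log(q))
        if lhs > rhs:
            return p, q, lhs, rhs
        eps /= 2.0
    raise RuntimeError("no violation found within the allowed number of halvings")
```

Calling `violation_example(r)` for increasing `r` returns pairs with ever smaller `q`, which reflects that the violation is driven by the regime of small $\lambda_{\min}(\sigma)$.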



The underlying reason for this failure is that the two bounds (19) and (17) are incompatible,
in the sense that there are two different regimes where either one or the other dominates. To
see when the logarithmic dependence dominates, let us again take the basis where $\sigma$ is
diagonal, with $s_i$ being the main diagonal elements. When keeping $\Delta = \rho - \sigma$
fixed while $s_1 = \lambda_{\min}(\sigma)$ tends to zero, then

$$\lim_{s_1\to 0} S(\sigma + \Delta\|\sigma)/|\log s_1| = \Delta_{11} < \infty.$$

Hence, in the regime where $\lambda_{\min}(\sigma)$ tends to zero and $\rho - \sigma$ is fixed,
the bound (19) is the appropriate one.

The other regime is the one where $\sigma$ is fixed and $\rho - \sigma$ tends to zero. This can
be intuitively seen by considering the case where the states $\rho$ and $\sigma$ commute (the
classical case). Let $p_i$ and $q_i$ be the diagonal elements of $\rho$ and $\sigma$,
respectively, in a diagonalising basis, and $r_i = p_i - q_i$. Then

$$S(\rho\|\sigma) = \sum_i (q_i + r_i)\log(1 + r_i/q_i).$$

We can develop $S(\rho\|\sigma)$ as a Taylor series in the $r_i$, giving, since
$\sum_i r_i = 0$,

$$S(\rho\|\sigma) = \sum_i \frac{r_i^2}{2q_i} + O(r_i^3).$$

Hence, in the regime where $\rho - \sigma$ tends to zero and $\sigma$ is otherwise fixed, the
relative entropy exhibits the behaviour of bound (17).
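The two regimes can be made tangible by evaluating both bounds side by side. The sketch below assumes the quadratic bound in the form $\|\rho-\sigma\|_2^2/\lambda_{\min}(\sigma)$ and the logarithmic bound in the form $T\log d + \min(-T\log T, 1/e) - (T/2)\log\lambda_{\min}(\sigma)$ with $T = \|\rho-\sigma\|_1$; the test states, obtained by mixing $\sigma$ with the maximally mixed state, are illustrative choices only.

```python
import numpy as np

def relative_entropy(rho, sigma):
    # S(rho||sigma) = Tr[rho (log rho - log sigma)], sigma non-singular.
    p = np.linalg.eigvalsh(rho)
    p = p[p > 1e-15]
    s, v = np.linalg.eigh(sigma)
    log_sigma = (v * np.log(s)) @ v.conj().T
    return float(np.sum(p * np.log(p)) - np.trace(rho @ log_sigma).real)

def quadratic_bound(rho, sigma):
    # ||rho - sigma||_2^2 / lambda_min(sigma)
    delta = rho - sigma
    return float(np.trace(delta @ delta).real) / float(np.linalg.eigvalsh(sigma)[0])

def logarithmic_bound(rho, sigma):
    # T log d + min(-T log T, 1/e) - (T/2) log(lambda_min(sigma)), T = trace norm
    d = sigma.shape[0]
    t = float(np.sum(np.abs(np.linalg.eigvalsh(rho - sigma))))
    beta = float(np.linalg.eigvalsh(sigma)[0])
    return t * np.log(d) + min(-t * np.log(t), 1.0 / np.e) - 0.5 * t * np.log(beta)

def mix_with_identity(sigma, t):
    # rho = (1 - t) sigma + t * identity / d is always a valid state.
    d = sigma.shape[0]
    return (1.0 - t) * sigma + t * np.eye(d) / d

# Regime A: sigma fixed, rho -> sigma.  Regime B: lambda_min(sigma) -> 0, rho - sigma fixed.
sigma_a = np.diag([0.5, 0.3, 0.2])
rho_a = mix_with_identity(sigma_a, 1e-3)
sigma_b = np.diag([0.7 - 1e-8, 0.3, 1e-8])
rho_b = mix_with_identity(sigma_b, 0.1)
```

In regime A the quadratic bound is the smaller of the two by orders of magnitude, whereas in regime B the quadratic bound blows up like $1/\lambda_{\min}(\sigma)$ and only the logarithmic one remains informative.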

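In terms of the matrix derivatives, this notion can be made more precise as follows. Denote the
bound (17) as

$$g(\rho\|\sigma) = \frac{\mathrm{Tr}[(\rho - \sigma)^2]}{\lambda_{\min}(\sigma)}$$

for states $\rho$, $\sigma$, then clearly

$$\lim_{\epsilon\to 0} g(\sigma + \epsilon\Delta\|\sigma)/\epsilon = 0.$$

On using the integral representation of the second Fréchet derivative of the matrix logarithm
[18],

$$\frac{d^2}{d\epsilon^2}\Big|_{\epsilon=0} \log(\sigma + \epsilon\Delta) = -2\int_0^\infty dx\,(\sigma + x)^{-1}\Delta(\sigma + x)^{-1}\Delta(\sigma + x)^{-1},$$

one obtains

$$\begin{aligned}
\frac{d^2}{d\epsilon^2}\Big|_{\epsilon=0} S(\sigma + \epsilon\Delta\|\sigma)
&= -2\,\mathrm{Tr}\Big[\sigma\int_0^\infty dx\,(\sigma + x)^{-1}\Delta(\sigma + x)^{-1}\Delta(\sigma + x)^{-1}\Big] \\
&\quad + 2\,\mathrm{Tr}\Big[\Delta\int_0^\infty dx\,(\sigma + x)^{-1}\Delta(\sigma + x)^{-1}\Big].
\end{aligned}$$

The right hand side is bounded from above by

$$\frac{d^2}{d\epsilon^2}\Big|_{\epsilon=0} S(\sigma + \epsilon\Delta\|\sigma) \le \int_0^\infty dx\,\mathrm{Tr}[\Delta(\sigma + x)^{-1}\Delta(\sigma + x)^{-1}],$$

see Refs. [15, 18]. This bound can be written as in Eq. (16). Therefore, one can conclude that

$$\frac{\partial^2}{\partial\epsilon^2}\Big|_{\epsilon=0} S(\sigma + \epsilon\Delta\|\sigma) = \frac{\partial^2}{\partial\epsilon^2}\Big|_{\epsilon=0} g(\sigma + \epsilon\Delta\|\sigma)$$

holds for all $\Delta$ satisfying $\mathrm{Tr}[\Delta] = 0$ if and only if
$\sigma = \mathbb{1}/d$, where $d$ is the dimension of the underlying Hilbert space. These
considerations seem to spell doom for any attempt at "unifying" the two kinds of upper bounds.
However, below we will see how a certain change of perspective will allow us to get out of the
dilemma.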



VI. A SHARP LOWER BOUND IN TERMS OF NORM 
DISTANCE 

We define $S_{\min}(T)$ with respect to a norm to be the smallest relative entropy between two
states that have a distance of exactly $T$ in that norm, that is

$$S_{\min}(T) = \min_{\rho,\sigma}\{S(\rho\|\sigma) : |||\rho - \sigma||| = T\}. \qquad (21)$$

When one agrees to assign $S(\rho\|\sigma) = +\infty$ for non-positive $\rho$, the definition
of $S_{\min}$ can be rephrased as

$$S_{\min}(T) = \min_{\Delta,\sigma}\{S(\sigma + \Delta\|\sigma) : |||\Delta||| = T,\ \mathrm{Tr}[\Delta] = 0\}. \qquad (22)$$

Intuitively one would guess that $S_{\min}$ is monotonically increasing with $T$. The following
lemma shows that this is true, but some care is required in proving it.

Lemma 7 For $T_1 \le T_2$, $S_{\min}(T_1) \le S_{\min}(T_2)$.

Proof. Keep $\sigma$ fixed and define

$$f_\sigma(T) = \min_{\Delta}\{S(\sigma + \Delta\|\sigma) : |||\Delta||| = T,\ \mathrm{Tr}[\Delta] = 0\},$$

so that $S_{\min}(T) = \min_\sigma f_\sigma(T)$. Considering $S(\sigma + \Delta\|\sigma)$ as a
function of $\Delta$, it is convex and minimal in the origin $\Delta = 0$. Furthermore, for the
norm balls

$$B(T) := \{\Delta : |||\Delta||| \le T,\ \mathrm{Tr}[\Delta] = 0\} \qquad (23)$$

we have

$$\{0\} = B(0) \subseteq B(T_1) \subseteq B(T_2). \qquad (24)$$

This is sufficient to prove that $0 = f_\sigma(0) \le f_\sigma(T_1) \le f_\sigma(T_2)$. Now,
since this holds for any $\sigma$, it also holds when minimising over $\sigma$, and that is
just the statement of the Lemma. □

As a direct consequence, a third equivalent definition of $S_{\min}(T)$ is

$$S_{\min}(T) = \min_{\Delta,\sigma}\{S(\sigma + \Delta\|\sigma) : |||\Delta||| \ge T,\ \mathrm{Tr}[\Delta] = 0\}. \qquad (25)$$

We now show that one can restrict oneself to the commutative case.

Lemma 8 The minimum in Eq. (22) is obtained for $\sigma$ and $\Delta$ commuting.

Proof. Fix $\Delta$ and consider a basis in which $\Delta$ is diagonal. Let
$\rho \mapsto \mathrm{Diag}(\rho)$ be the completely positive trace-preserving map which, in
that basis, sets all off-diagonal elements of $\rho$ equal to zero. Thus
$\mathrm{Diag}(\Delta) = \Delta$. By monotonicity of the relative entropy,

$$S(\sigma + \Delta\|\sigma) \ge S(\mathrm{Diag}(\sigma) + \Delta\|\mathrm{Diag}(\sigma)).$$

Minimising over all states $\sigma$ then gives

$$\min_\sigma S(\sigma + \Delta\|\sigma) \ge \min_\sigma S(\mathrm{Diag}(\sigma) + \Delta\|\mathrm{Diag}(\sigma)) = \min_\sigma\{S(\sigma + \Delta\|\sigma) : [\sigma, \Delta] = 0\}.$$

On the other hand, the states $\sigma$ that commute with $\Delta$ are included in the domain of
minimisation of the left-hand side, hence equality holds. □

FIG. 1: The function $s$ defined in Eq. (26) (middle curve), the lower bound $2x^2$ (lower
curve), and the upper bound $-\log(1 - x)$ (upper curve).



For later reference we define the auxiliary function

$$s(x) := \min_{0 < r < 1 - x} S((r + x, 1 - r - x)\|(r, 1 - r)), \qquad (26)$$

for $0 \le x \le 1$. An equivalent expression for this function is given by

$$s(x) := \min_{x < r < 1} S((r - x, 1 - r + x)\|(r, 1 - r)). \qquad (27)$$

The first three non-zero terms in its series expansion around $x = 0$ are given by:

$$s(x) = 2x^2 + \frac{4}{9}x^4 + \frac{32}{135}x^6 + O(x^8) \qquad (28)$$

(obtained using a computer algebra package). Further calculations reveal that some of the
higher-order coefficients are
negative, the first one being the coefficient of x e2 . One can 
easily prove [4] that the lowest order expansion $2x^2$ is actually a lower bound. It is,
therefore, the sharpest quadratic lower bound. For values of $x$ up to 1/2, the error incurred
by considering only the lowest order term in (28) is at most 6.5%. For larger values of $x$,
the error increases rapidly. In fact, when $x$ tends to its maximal value of 1, $s(x)$ tends to
infinity, as can easily be seen from the minimisation expression ($r$ tends to 0); accordingly,
the series expansion diverges. For values of $x > 4/5$, $s(x)$ is well approximated by its
upper bound

$$s(x) \le \lim_{r\to 1 - x} S((r + x, 1 - r - x)\|(r, 1 - r)) = -\log(1 - x).$$

This is illustrated in Figure 1.
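The auxiliary function $s$ has no closed form, but it is cheap to evaluate. The sketch below minimises the two-point relative entropy over $r \in (0, 1-x)$ with a ternary search, which is justified because the objective is convex in $r$ (joint convexity of the relative entropy composed with an affine map). The search method, iteration count and function names are assumptions made for illustration.

```python
import math

def binary_relative_entropy(a, b):
    """S((a, 1-a)||(b, 1-b)) with the convention 0 log 0 = 0; needs 0 < b < 1."""
    total = 0.0
    for u, v in ((a, b), (1.0 - a, 1.0 - b)):
        if u > 0.0:
            total += u * math.log(u / v)
    return total

def s(x, iterations=200):
    """Minimum over 0 < r < 1 - x of S((r + x, 1 - r - x)||(r, 1 - r)), for 0 <= x < 1."""
    if x == 0.0:
        return 0.0
    lo, hi = 0.0, 1.0 - x
    for _ in range(iterations):
        m1 = lo + (hi - lo) / 3.0
        m2 = hi - (hi - lo) / 3.0
        # Convex objective: discard the third of the interval that cannot hold the minimum.
        if binary_relative_entropy(m1 + x, m1) < binary_relative_entropy(m2 + x, m2):
            hi = m2
        else:
            lo = m1
    r = 0.5 * (lo + hi)
    return binary_relative_entropy(r + x, r)
```

Evaluating `s` on a grid and comparing it with `2 * x**2` and `-math.log(1 - x)` gives the three curves of the kind described in the caption of Figure 1.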

Let us now come back to Eq. (22), with $\sigma$ and $\Delta$ diagonal, and $|||\cdot|||$ any
unitarily invariant norm. Let $\sigma$ and $\Delta$ have diagonal elements $\sigma_k$ and
$\Delta_k$, respectively. Fixing $\Delta$, we minimise first over $\sigma$. This is a convex
problem and any local minimum is automatically a global minimum [2]. The corresponding
Lagrangian is

$$\mathcal{L} = \sum_k \sigma_k(1 + \Delta_k/\sigma_k)\log(1 + \Delta_k/\sigma_k) - \nu\Big(\sum_k \sigma_k - 1\Big). \qquad (29)$$

The derivative of the Lagrangian with respect to $\sigma_k$ is

$$\frac{\partial\mathcal{L}}{\partial\sigma_k} = \log(1 + \Delta_k/\sigma_k) - \Delta_k/\sigma_k - \nu. \qquad (30)$$

This must vanish in a critical point, giving the expression

$$\log(1 + \Delta_k/\sigma_k) = \Delta_k/\sigma_k + \nu. \qquad (31)$$

Now note that the equation $\log(1 + x) = x + b$, for $b < 0$, has only two real solutions, one
positive and one negative, and none for $b > 0$. Therefore, for any $k$,
$\Delta_k/\sigma_k$ can assume only one of these two possible values. Let $K$ be an integer
between 1 and $d - 1$. Without loss of generality we can set

$$\Delta_k/\sigma_k = \begin{cases} c_p, & 1 \le k \le K, \\ -c_m, & K < k \le d, \end{cases} \qquad (32)$$

where $c_p$ and $c_m$ are positive numbers, to be determined along with $K$. The requirement
$\sum_k \Delta_k = 0$ imposes

$$c_p\sum_{k=1}^{K}\sigma_k = c_m\sum_{k=K+1}^{d}\sigma_k,$$

which upon defining

$$r := \sum_{k=1}^{K}\sigma_k \qquad (33)$$

turns into

$$c_p r = c_m(1 - r) =: c. \qquad (34)$$

Substituting Eqs. (32) and (33), the function to be minimised becomes

$$r(1 + c_p)\log(1 + c_p) + (1 - r)(1 - c_m)\log(1 - c_m),$$

which, given Eq. (34), can be rewritten as

$$S((r + c, 1 - r - c)\|(r, 1 - r)).$$

The one remaining constraint $|||\Delta||| = T$ likewise becomes

$$|||(c_p\sigma_1, \ldots, c_p\sigma_K, -c_m\sigma_{K+1}, \ldots, -c_m\sigma_d)||| = T.$$

Defining

$$\tau' := (\sigma_1, \ldots, \sigma_K)/r, \qquad \tau'' := (\sigma_{K+1}, \ldots, \sigma_d)/(1 - r),$$

this turns into

$$T = |||(c_p r\,\tau'; -c_m(1 - r)\,\tau'')||| = c\,|||(\tau'; \tau'')|||,$$

where we have exploited the homogeneity of a norm. Note that by their definition, $\tau'$ and
$\tau''$ are vectors consisting of positive numbers adding up to 1.

The minimisation itself thus turns into

$$S_{\min}(T) = \min_{r,\tau',\tau''} S((r + c, 1 - r - c)\|(r, 1 - r)),$$

where $c := T/|||(\tau'; \tau'')|||$. Quite obviously, the minimum over $c$ is obtained for the
smallest possible $c$, hence

$$S_{\min}(T) = \min_r S((r + T/\gamma, 1 - r - T/\gamma)\|(r, 1 - r)) = s(T/\gamma),$$

with

$$\gamma = \max_{\tau',\tau''} |||(\tau'; \tau'')|||.$$

By convexity of a norm, this maximum is obtained in an extreme point, so $\gamma = |||F|||$.
Incidentally, by Lemma 5 this value is also the maximum

$$\max_{\rho,\sigma} |||\rho - \sigma|||$$

over all possible states $\rho$ and $\sigma$, i.e., $\gamma$ is the largest possible value of
$T$ for the given norm. We have thus proven

Theorem 4 For any unitarily invariant norm $|||\cdot|||$, we have the sharp lower bound

$$S(\rho\|\sigma) \ge s(|||\rho - \sigma|||/|||F|||). \qquad (35)$$

A few remarks are in order at this point:



1. Within the setting of finite-dimensional systems, this theorem generalises a result of Hiai,
Ohya and Tsukada [10, 18] for the trace norm to all unitarily invariant norms. That work also
uses the technique of getting lower bounds by projecting on an abelian subalgebra and then
exploiting the case of a two-dimensional support as the worst case scenario.

2. If we take the Hiai-Ohya-Tsukada result for granted and combine it with Lemma 3 we
immediately get

$$S(\rho\|\sigma) \ge s(\|\rho - \sigma\|_1/\|F\|_1) \ge s(|||\rho - \sigma|||/|||F|||).$$

3. The divergence of $s$ at $x = 1$ is easily understood. The largest norm difference between
two states occurs for orthogonal pure states, in which case their relative entropy is infinite.



VII. SHARP UPPER BOUNDS IN TERMS OF NORM DISTANCE

Let now $S_{\max}(T, \beta)$ be the largest relative entropy between $\rho$ and $\sigma$ that
have a normalised distance of exactly $T$ and $\lambda_{\min}(\sigma) = \beta$, so let

$$S_{\max}(T, \beta) := \max_{\rho,\sigma}\Big\{S(\rho\|\sigma) : \frac{|||\rho - \sigma|||}{|||F|||} = T,\ \lambda_{\min}(\sigma) = \beta\Big\}. \qquad (36)$$

The need for the extra parameter $\beta$ arises because for $\beta = 0$, $S_{\max}$ is
infinite, as can be seen by taking different pure states for $\rho$ and $\sigma$. We can
rephrase this definition as

$$S_{\max}(T, \beta) = \max_{\Delta,\sigma}\Big\{S(\sigma + \Delta\|\sigma) : \frac{|||\Delta|||}{|||F|||} = T,\ \mathrm{Tr}[\Delta] = 0,\ \sigma + \Delta \ge 0,\ \lambda_{\min}(\sigma) = \beta\Big\}. \qquad (37)$$

Because $\Delta$ commutes with the identity matrix, there is a unique common least upper bound
on $\beta\mathbb{1}$ and $-\Delta$, which we will denote by $\max(\beta\mathbb{1}, -\Delta)$.
In the eigenbasis of $\Delta$, this is a diagonal matrix with diagonal elements
$\max(\beta, -\Delta_i)$. The constraints $\sigma \ge \beta$ and $\sigma + \Delta \ge 0$ can
therefore be combined into the single constraint

$$\sigma \ge \max(\beta\mathbb{1}, -\Delta). \qquad (38)$$

The extremal $\sigma$ obeying this constraint are

$$\sigma = \max(\beta\mathbb{1}, -\Delta) + \eta|\psi\rangle\langle\psi|, \qquad (39)$$

where $\psi$ is any state vector, and

$$\eta := 1 - \mathrm{Tr}[\max(\beta\mathbb{1}, -\Delta)]. \qquad (40)$$

Therefore, the constrained maximisation over $\sigma$ can be replaced by an unconstrained
maximisation over all pure states of the function

$$S\big(\Delta + \max(\beta\mathbb{1}, -\Delta) + \eta|\psi\rangle\langle\psi|\ \big\|\ \max(\beta\mathbb{1}, -\Delta) + \eta|\psi\rangle\langle\psi|\big). \qquad (41)$$

Of course, all of this puts constraints on $\Delta$ as well. Indeed, in order that states
$\sigma$ obeying (38) exist, $\max(\beta\mathbb{1}, -\Delta)$ must obey the condition

$$\mathrm{Tr}[\max(\beta\mathbb{1}, -\Delta)] \le 1. \qquad (42)$$

We now have to distinguish between two cases: the case $d = 2$, and the case $d > 2$.



A. The case d = 2 

For the d = 2 case, the maximisation over A is trivial. In its 
eigenbasis, A is a múltiple of Diag(l, —1) = F. Hence, fix- 
ing the eigenbasis of A (which we can do because of unitary 
invariance of the relative entropy), and fixing 

|||A|||/|||F||| =T, (43) 
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actually leaves just one possibility for A, namely A = TF, 
The term max(/31, — A) leads to two cases: T < (3 and T > 
0. 

The condition T < (3 implies, by Lemma[3] that HAH^ < 
(3 and, hence, 

max(/31,-A) = Diag(/3,/3), 
X] = 1-2(3. 

The remaining maximisation of (14 1 i is therefore given by 

max S(Diag(/3 + T, /3 - T) + (1 - 2/3) |V) (VI 1 1 

Diag(/3,/3) + (1-2/3) |V) (V|). (44) 

Positivity of r] requires (3 < 1/2. By unitary invariance of 
the relative entropy, and invariance of diagonal states under 
diagonal unitàries (phase factors), we can restrict ourselves to 
vectors ip of the form V = (cos a, sin a) T . 

Lemma 9 For a state vector ip = (cosa, sin a) T , the func- 
tion to be maximised in (1441 is convex in cos(2a). 

Proof. Let $D_1$ be the determinant of the first argument. It is linear in $t := \cos(2\alpha)$:

$$D_1 = \beta^2 - T^2 + (1-2\beta)(\beta - Tt).$$

After some basic algebra involving eigensystem decompositions of the states, the function to be maximised in (44) is found to be given by

$$f(x) := \big((1-x)\log(1-x) + (1+x)\log(1+x)\big)/2 + (-1+2\beta-2Tt)\big(\log(1-\beta) - \log\beta\big)/2 - \big(\log(4-4\beta) + \log\beta\big)/2,$$

where $x = (1-4D_1)^{1/2}$. We will now show that this function is convex in $t$. Since the second and third terms are linear in $t$, we only need to show convexity for the first term. The series expansion of the first term is

$$\big((1-x)\log(1-x) + (1+x)\log(1+x)\big)/2 = \sum_{k=1}^{\infty} \frac{x^{2k}}{2k(2k-1)}.$$

Every term in the expansion is a positive power of $x^2$ with positive coefficient and is therefore convex in $x^2$, which itself is linear in $t$. The sum is therefore also convex in $t$. □



By the above Lemma, the maximum over $\psi$ is obtained for extremal values of $t$, that is: either $\psi = (1, 0)^T$ or $\psi = (0, 1)^T$. Evaluation of the maximum is now straightforward and it can be checked that the choice $\psi = (1, 0)^T$ always yields the largest value of the relative entropy.
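As a numerical sanity check of Lemma 9 (an illustration, not part of the proof), the following Python sketch evaluates the closed form $f$ on a grid of $t = \cos(2\alpha)$ and compares the two endpoints with the relative entropy of the corresponding diagonal states. The values $\beta = 0.3$, $T = 0.2$ and the helper names `d1`, `f44` and `kl` are assumptions made for this example only; natural logarithms are used.

```python
import math

def d1(beta, T, t):
    """Determinant D_1 of the first argument of (44); linear in t = cos(2*alpha)."""
    return beta**2 - T**2 + (1 - 2*beta) * (beta - T*t)

def f44(beta, T, t):
    """Objective of (44), evaluated through the closed form f(x), x = sqrt(1 - 4*D_1)."""
    x = math.sqrt(max(0.0, 1 - 4*d1(beta, T, t)))  # clamp guards against rounding
    first = sum(p * math.log(p) for p in (1 - x, 1 + x) if p > 0) / 2
    second = (-1 + 2*beta - 2*T*t) * (math.log(1 - beta) - math.log(beta)) / 2
    third = -(math.log(4 - 4*beta) + math.log(beta)) / 2
    return first + second + third

def kl(p, q):
    """Relative entropy of two commuting states given by their spectra."""
    return sum(a * math.log(a / b) for a, b in zip(p, q) if a > 0)

beta, T = 0.3, 0.2  # illustrative values with T <= beta <= 1/2
ts = [-1 + 0.01*i for i in range(201)]
vals = [f44(beta, T, t) for t in ts]
# A convex function has non-negative second differences on a uniform grid.
second_diffs = [vals[i-1] - 2*vals[i] + vals[i+1] for i in range(1, 200)]
print(min(second_diffs) > -1e-12)
print(f44(beta, T, 1.0), kl((0.9, 0.1), (0.7, 0.3)))   # psi = (1, 0)^T
print(f44(beta, T, -1.0), kl((0.5, 0.5), (0.3, 0.7)))  # psi = (0, 1)^T
```

For these parameters the endpoint $t = 1$, that is $\psi = (1, 0)^T$, gives the larger value, in line with the statement above.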

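We will now look more specifically at the case where $T > \beta$. In this case, we get

$$\max(\beta\mathbb{1}, -\Delta) = \mathrm{Diag}(\beta, T), \qquad \eta = 1 - \beta - T,$$

and the remaining maximisation of (41) is given by

$$\max_\psi\, S\big(\mathrm{Diag}(\beta+T, 0) + (1-\beta-T)|\psi\rangle\langle\psi| \,\big\|\, \mathrm{Diag}(\beta, T) + (1-\beta-T)|\psi\rangle\langle\psi|\big). \tag{45}$$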

Positivity of $\eta$ requires $\beta \le 1/2$ and $T \le 1 - \beta$. Again, we can restrict ourselves to states $\psi = (\cos\alpha, \sin\alpha)^T$. We also have the equivalent of Lemma 9, which needs more work in this case:

Lemma 10 For a state vector $\psi = (\cos\alpha, \sin\alpha)^T$, the function to be maximised in (45) is convex in $\cos(2\alpha)$.

Proof. Let $D_1$ and $D_2$ be the determinant of the first and second argument, respectively. Both are linear in $t := \cos(2\alpha)$:

$$D_1 = (1-\beta-T)(\beta+T)(1-t)/2,$$
$$D_2 = \big((\beta + T - \beta^2 - T^2) + (1-\beta-T)(T-\beta)t\big)/2.$$

In the $(D_1, D_2)$-plane, this describes a line segment with gradient

$$K := -\frac{T-\beta}{T+\beta},$$

which lies in the interval $[-1, 0]$.

Again, after some basic algebra, the function to be maximised in (45) is identified to be $f((1-4D_1)^{1/2}, (1-4D_2)^{1/2})$, where

$$f(x,y) := \big((1-x)\log(1-x) + (1+x)\log(1+x)\big)/2 + \big((x^2+y^2-2y-4T^2)\log(1-y) - (x^2+y^2+2y-4T^2)\log(1+y)\big)/(4y).$$

We will now show that $f((1-4D_1)^{1/2}, (1-4D_2)^{1/2})$ is convex in $t$. First, note that

$$f(x,y) = f_0(x,y) + T^2 f_1(y).$$

The term $f_1(y)$ is itself convex in $t$: its series expansion is

$$f_1(y) = \big(\log(1+y) - \log(1-y)\big)/y = 2\sum_{k=0}^{\infty} \frac{y^{2k}}{2k+1},$$

which by the positivity of all its coefficients is convex in $y^2$, and $y^2$ is linear in $t$.

The other term, $f_0(x,y)$, is given by a sum of three terms:

$$f_0(x,y) = \tfrac{1}{2}\big((1-x)\log(1-x) + (1+x)\log(1+x)\big) + \tfrac{1}{4}\big((y-2)\log(1-y) - (y+2)\log(1+y)\big) - \frac{x^2}{4}\big(\log(1+y) - \log(1-y)\big)/y.$$

Replacing each of the three terms by its series expansion yields

$$f_0(x,y) = \sum_{k=1}^{\infty} \frac{x^{2k}}{2k(2k-1)} + \sum_{k=1}^{\infty} \frac{(k-1)\,y^{2k}}{2k(2k-1)} - \frac{x^2}{2}\sum_{k=0}^{\infty} \frac{y^{2k}}{2k+1}.$$

To show that this function is convex in $t$, we will evaluate it along the curve

$$x^2 = u + p, \qquad y^2 = v + Kp,$$

with gradient $K$ between $0$ and $-1$, and $u$ and $v$ lying in the interval $[0, 1]$, and check positivity of its second derivative with respect to $p$ at $p = 0$:

$$\frac{\partial^2}{\partial p^2}\bigg|_{p=0} f_0(x,y) = \frac{1}{2}\sum_{k=2}^{\infty} (k-1)\left[\frac{u^{k-2}}{2k-1} + \left(\frac{K\big((k-1)K-2\big)}{2k-1} - K^2 u\,\frac{k}{2k+1}\right) v^{k-2}\right].$$

The coefficient of $u^{k-2}$ is clearly positive, hence the derivative is positive if the coefficient of $v^{k-2}$ is positive for all allowed values of $u$ and $K$. The worst case occurs for $u = 1$, yielding a coefficient

$$\frac{K\big((k-1)K-2\big)}{2k-1} - \frac{K^2 k}{2k+1} = \frac{-K(2+4k+K)}{(2k-1)(2k+1)}.$$

For values of $K$ between $0$ and $-1$, this is indeed positive. □
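The convexity claimed in Lemma 10 can also be probed numerically. The sketch below is an illustration under assumed parameters ($\beta = 0.1$, $T = 0.3$, chosen so that $y$ stays strictly between 0 and 1); the names `xlogx` and `f45` are hypothetical helpers, not notation from the text. It evaluates $f(x, y)$ along $t \in [-1, 1]$ and checks the second differences.

```python
import math

def xlogx(p):
    """p*log(p), continued by 0 at p = 0."""
    return p * math.log(p) if p > 0 else 0.0

def f45(beta, T, t):
    """Objective of (45) through f(x, y), with x and y built from D_1 and D_2."""
    d1 = (1 - beta - T) * (beta + T) * (1 - t) / 2
    d2 = ((beta + T - beta**2 - T**2) + (1 - beta - T) * (T - beta) * t) / 2
    x = math.sqrt(max(0.0, 1 - 4*d1))
    y = math.sqrt(max(0.0, 1 - 4*d2))  # assumed to lie strictly inside (0, 1)
    first = (xlogx(1 - x) + xlogx(1 + x)) / 2
    second = ((x*x + y*y - 2*y - 4*T*T) * math.log(1 - y)
              - (x*x + y*y + 2*y - 4*T*T) * math.log(1 + y)) / (4*y)
    return first + second

beta, T = 0.1, 0.3  # illustrative values with beta < T <= 1 - beta
ts = [-1 + 0.01*i for i in range(201)]
vals = [f45(beta, T, t) for t in ts]
second_diffs = [vals[i-1] - 2*vals[i] + vals[i+1] for i in range(1, 200)]
print(min(second_diffs) > -1e-12)   # convexity on the grid
print(vals[0], vals[-1])            # psi = (0, 1)^T and psi = (1, 0)^T
```

At $t = 1$ the value should agree with $-\log(1-T)$, and at $t = -1$ with the relative entropy of $\mathrm{Diag}(\beta+T, 1-\beta-T)$ with respect to $\mathrm{Diag}(\beta, 1-\beta)$.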



By the above Lemma, the maximum over $\psi$ is obtained for extremal values of $t$, that is: either $\psi = (1, 0)^T$ or $\psi = (0, 1)^T$. Evaluation of the maximum is again straightforward, and calculations show that sometimes $\psi = (1, 0)^T$ yields the larger value, and sometimes $\psi = (0, 1)^T$. In this way we have obtained the upper bounds:



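Theorem 5 Let $\Delta = \rho - \sigma$, $T = |||\Delta|||/|||F|||$ and $\beta = \lambda_{\min}(\sigma)$. For $d = 2$, and $T \le \beta$,

$$S(\rho\|\sigma) \le (T+1-\beta)\log\frac{T+1-\beta}{1-\beta} + (\beta-T)\log(1 - T/\beta). \tag{46}$$

For $d = 2$, and $T > \beta$,

$$S(\rho\|\sigma) \le \max\Big(-\log(1-T),\; (\beta+T)\log(1+T/\beta) + (1-\beta-T)\log\big(1 - T/(1-\beta)\big)\Big). \tag{47}$$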
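The two branches of Theorem 5 are easy to evaluate. The following Python sketch is one possible transcription of (46) and (47); the function name `upper_bound_d2`, the helper `kl`, and the sample values are assumptions made for illustration, and natural logarithms are used.

```python
import math

def kl(p, q):
    """Relative entropy of two commuting states given by their spectra."""
    return sum(a * math.log(a / b) for a, b in zip(p, q) if a > 0)

def upper_bound_d2(T, beta):
    """Right-hand side of (46) for T <= beta and of (47) for beta < T <= 1 - beta."""
    if not (0 < beta <= 0.5 and 0 <= T <= 1 - beta):
        raise ValueError("need 0 < beta <= 1/2 and 0 <= T <= 1 - beta")
    if T <= beta:
        ratio = 1 - T / beta
        tail = (beta - T) * math.log(ratio) if ratio > 0 else 0.0  # 0*log(0) := 0
        return (T + 1 - beta) * math.log((T + 1 - beta) / (1 - beta)) + tail
    branch_a = -math.log(1 - T)                       # psi = (1, 0)^T
    ratio = 1 - T / (1 - beta)
    tail = (1 - beta - T) * math.log(ratio) if ratio > 0 else 0.0
    branch_b = (beta + T) * math.log(1 + T / beta) + tail  # psi = (0, 1)^T
    return max(branch_a, branch_b)

print(upper_bound_d2(0.2, 0.3))  # regime T <= beta
print(upper_bound_d2(0.3, 0.1))  # regime T > beta, first branch of the max is larger
print(upper_bound_d2(0.7, 0.1))  # regime T > beta, second branch of the max is larger
```

Each value coincides with the relative entropy of an explicit pair of diagonal qubit states, for example $\mathrm{Diag}(0.9, 0.1)$ against $\mathrm{Diag}(0.7, 0.3)$ in the first call.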

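It is interesting to study the behaviour of the bound in the case of large $\beta$. More specifically, an approximation for bound (46), valid for $T \ll \beta$, is

$$S(\rho\|\sigma) \le \sum_{k=2}^{\infty} \frac{T^k}{k(k-1)}\left(\frac{1}{\beta^{k-1}} + \frac{(-1)^k}{(1-\beta)^{k-1}}\right) \approx \frac{T^2}{2\beta(1-\beta)}. \tag{48}$$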
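A quick numerical comparison shows how the series and its leading term relate to the exact bound. The sketch below is illustrative only: the function names, the truncation order `kmax` and the sample point $\beta = 0.4$, $T = 0.04$ are assumptions.

```python
import math

def bound46(T, beta):
    """Exact right-hand side of (46), for 0 <= T < beta."""
    return ((T + 1 - beta) * math.log((T + 1 - beta) / (1 - beta))
            + (beta - T) * math.log(1 - T / beta))

def series48(T, beta, kmax=60):
    """Truncated power series in T appearing in (48)."""
    return sum(T**k / (k * (k - 1))
               * (beta**(1 - k) + (-1)**k * (1 - beta)**(1 - k))
               for k in range(2, kmax + 1))

def leading48(T, beta):
    """Leading-order term T^2 / (2 beta (1 - beta))."""
    return T**2 / (2 * beta * (1 - beta))

T, beta = 0.04, 0.4
print(bound46(T, beta), series48(T, beta), leading48(T, beta))
```

At this sample point the truncated series reproduces the exact bound to rounding accuracy, while the leading term alone is off by roughly one percent.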



Figure 2 illustrates the combined upper bounds of Theorem 5 ($d = 2$) for various values of $\beta$.



B. The case d > 2 

In case $d$ is larger than 2, it is not clear how to proceed in the most general setting, for general UI norms, as the maximisation over $\Delta$ must explicitly be performed. In the following, we will restrict ourselves to using the trace norm, which is in some sense the most important one anyway. That is, the requirements on $\Delta$ are

$$\|\Delta\|_1 = 2T, \tag{49}$$
$$\mathrm{Tr}[\Delta] = 0, \tag{50}$$
$$\mathrm{Tr}[\max(\beta\mathbb{1}, -\Delta)] \le 1. \tag{51}$$



The following very simple Lemma will prove to be a powerful tool.

Lemma 11 For all $A$, $B$, and $C$, positive semi-definite operators,

$$S(A + C\|B + C) \le S(A\|B).$$

Proof. First note that for any $a > 0$,

$$S(aA\|aB) = \mathrm{Tr}[aA(\log(aA) - \log(aB))] = aS(A\|B).$$

This, together with joint convexity of the relative entropy in its arguments (which need not be normalised to trace 1), leads to

$$S(A + C\|B + C) = 2S\Big(\frac{A+C}{2}\Big\|\frac{B+C}{2}\Big) \le S(A\|B) + S(C\|C) = S(A\|B).$$

□
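Lemma 11 can be illustrated numerically in the simplest setting. The sketch below restricts to commuting (diagonal) operators, which is an assumption made to keep the example dependency-free and covers only a special case of the Lemma; the helper `rel_ent`, the dimension 4 and the sampling ranges are likewise illustrative.

```python
import math
import random

def rel_ent(a, b):
    """S(A||B) = Tr[A(log A - log B)] for commuting A, B >= 0 given by their diagonals."""
    return sum(x * (math.log(x) - math.log(y)) for x, y in zip(a, b) if x > 0)

random.seed(0)
worst_gap = math.inf
for _ in range(1000):
    a = [random.uniform(0.01, 1.0) for _ in range(4)]
    b = [random.uniform(0.01, 1.0) for _ in range(4)]
    c = [random.uniform(0.0, 1.0) for _ in range(4)]
    shifted = rel_ent([x + z for x, z in zip(a, c)],
                      [y + z for y, z in zip(b, c)])
    # Lemma 11 predicts S(A||B) - S(A+C||B+C) >= 0.
    worst_gap = min(worst_gap, rel_ent(a, b) - shifted)
print(worst_gap >= -1e-12)
```

In the commuting case the inequality even holds entry by entry, since $(a+c)\log\frac{a+c}{b+c}$ is non-increasing in $c \ge 0$.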



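The Lemma immediately yields an upper bound on (41). Letting

$$\sigma := \max(\beta\mathbb{1}, -\Delta) + \eta|\psi\rangle\langle\psi|,$$

and applying the Lemma with $C = \eta|\psi\rangle\langle\psi|$, we obtain

$$S(\Delta + \sigma\|\sigma) \le S\big(\Delta + \max(\beta\mathbb{1}, -\Delta)\,\big\|\,\max(\beta\mathbb{1}, -\Delta)\big) = S\big((\Delta + \beta\mathbb{1})_+\,\big\|\,\beta\mathbb{1} + (\Delta + \beta\mathbb{1})_-\big). \tag{52}$$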

To continue, we consider two cases.

Case 1: When $T \le \beta$, the requirement (51) is automatically satisfied, and $\max(\beta\mathbb{1}, -\Delta) = \beta\mathbb{1}$. Let $\Delta_+$ and $\Delta_-$ be the positive and negative part of $\Delta$, respectively. That is, $\Delta = \Delta_+ - \Delta_-$, with $\Delta_+$ and $\Delta_-$ non-negative and orthogonal. Because we are using the trace norm we can rewrite the conditions on $\Delta$ as

$$\|\Delta\|_1 = \mathrm{Tr}[\Delta_+] + \mathrm{Tr}[\Delta_-] = 2T,$$
$$\mathrm{Tr}[\Delta] = \mathrm{Tr}[\Delta_+] - \mathrm{Tr}[\Delta_-] = 0,$$

hence

$$\mathrm{Tr}[\Delta_+] = \mathrm{Tr}[\Delta_-] = T.$$

By Lemma 11, (41) is upper bounded by $S(\Delta + \beta\mathbb{1}\|\beta\mathbb{1})$. By convexity, its maximum over $\Delta_+, \Delta_- \ge 0$, with $\mathrm{Tr}[\Delta_+] = \mathrm{Tr}[\Delta_-] = T$, is obtained in $\Delta_+$ and $\Delta_-$ of rank 1, giving as upper bound

$$S(\Delta + \sigma\|\sigma) \le (\beta+T)\log\frac{\beta+T}{\beta} + (\beta-T)\log\frac{\beta-T}{\beta}.$$

The upper bound can be achieved in dimensions $d \ge 3$ for all values of $T \le \beta$ by setting $\Delta = TF$ and $\psi = e_3$.

FIG. 2: Upper bounds of Theorem 5 on $S = S(\rho\|\sigma)$ vs. the rescaled norm distance $T = |||\rho - \sigma|||/|||F|||$, for $d = 2$, and for values of the smallest eigenvalue of $\sigma$ (a) $\beta = 0.1$, (b) 0.2, (c) 0.3, and (d) 0.5. The two regimes $T \le \beta$ and $\beta < T \le 1 - \beta$ can be clearly identified. For ease of comparison, each curve is shown superimposed on the curves for $\beta = 0.1$, 0.2, 0.3, 0.4 and 0.5 (in grey).

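Case 2: In the other case, when $T > \beta$, we have to deal with condition (51). To do that we split $\Delta$ into three non-negative parts,

$$\Delta = \Delta_+ - \Delta_0 - \Delta_-, \tag{53}$$

with $\Delta_+$, $\Delta_0$ and $\Delta_-$ operating on orthogonal subspaces $V_+$, $V_0$ and $V_-$, respectively, with

$$\Delta_+ \ge 0, \qquad 0 \ge -\Delta_0 \ge -\beta\mathbb{1}, \qquad -\beta\mathbb{1} \ge -\Delta_-,$$

where each inequality is understood on the corresponding subspace.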


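We denote the projectors on these subspaces by $\mathbb{1}_+$, $\mathbb{1}_0$, and $\mathbb{1}_-$. Then

$$(\Delta + \beta\mathbb{1})_+ = \Delta_+ - \Delta_0 + \beta\mathbb{1}_{+0},$$

where $\mathbb{1}_{+0} := \mathbb{1}_+ + \mathbb{1}_0$. The conditions on $\Delta$, $\mathrm{Tr}[\Delta] = 0$ and $\mathrm{Tr}[|\Delta|] = 2T$, translate to

$$\mathrm{Tr}[\Delta_+] = \mathrm{Tr}[\Delta_0] + \mathrm{Tr}[\Delta_-] = T.$$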

Due to the orthogonality of positive and negative part, (52) can be simplified to $S((\Delta + \beta\mathbb{1})_+\|\beta\mathbb{1}_{+0})$. After subtracting $\beta\mathbb{1}_0 - \Delta_0$ from both arguments, we get

$$S(\Delta_+ + \beta\mathbb{1}_+\|\beta\mathbb{1}_+ + \Delta_0),$$

which is an upper bound on (52), by Lemma 11. Ignoring condition (51) on $\Delta$, we get

$$S_{\max} \le \max_{\mathrm{Tr}[\Delta_+] = T} S(\Delta_+ + \beta\mathbb{1}_+\|\beta\mathbb{1}_+).$$

By convexity, the maximum is obtained for $\Delta_+$ rank 1, giving the upper bound

$$S_{\max} \le (T+\beta)\log\big((T+\beta)/\beta\big).$$

FIG. 3: Comparison between upper bounds (19) and (55)-(56) on $S = S(\rho\|\sigma)$ vs. the trace norm distance $T = \|\rho - \sigma\|_1/2$, for various values of $\beta$, the smallest eigenvalue of $\sigma$. The upper set of dashed curves depict bound (19) (with $d = 3$) for $\beta = 0.1$ (lower curve), 0.2, 0.3, 0.4 and 0.5 (upper curve). The lower set of full line curves depict bounds (55)-(56) for $\beta = 0.1$ (upper curve), 0.2, 0.3, 0.4 and 0.5 (lower curve). The two regimes $T \le \beta$ and $\beta < T \le 1 - \beta$ can be clearly seen.



To see that this bound is sharp for (almost) any value of $T$, consider the two states

$$\rho = \mathrm{Diag}(T+\beta,\, 0,\, 0^{\times J},\, \beta^{\times K},\, \beta+\eta),$$
$$\sigma = \mathrm{Diag}(\beta,\, T-J\beta,\, \beta^{\times J},\, \beta^{\times K},\, \beta+\eta),$$
$$\eta := 1 - T - (d-1-J)\beta.$$

Here, $J$ is an integer between $0$ and $d-3$ and $K = d-3-J$. Conditions on $J$ are $J\beta \le T$ (so that $\sigma \ge 0$) and $T \le 1 - (d-1-J)\beta$ (so that $\eta \ge 0$). This choice of states can thus be obtained for $\beta \le T \le 1 - 2\beta$. It can be seen that $\|\rho - \sigma\|_1 = 2T$ and

$$S(\rho\|\sigma) = (T+\beta)\log\big((T+\beta)/\beta\big). \tag{54}$$
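The construction can be checked directly, since both states are diagonal. The following Python sketch builds the two spectra and verifies the trace norm distance and (54); the function name `sharp_pair` and the sample values $d = 4$, $J = 1$, $\beta = 0.1$, $T = 0.3$ are assumptions for illustration.

```python
import math

def sharp_pair(T, beta, d, J):
    """Spectra of the diagonal states rho and sigma that attain (54)."""
    K = d - 3 - J
    eta = 1 - T - (d - 1 - J) * beta
    if not 0 <= J <= d - 3 or T - J * beta < 0 or eta < 0:
        raise ValueError("parameters violate the conditions on J")
    rho = [T + beta, 0.0] + [0.0] * J + [beta] * K + [beta + eta]
    sigma = [beta, T - J * beta] + [beta] * J + [beta] * K + [beta + eta]
    return rho, sigma

def kl(p, q):
    """Relative entropy of two commuting states given by their spectra."""
    return sum(a * math.log(a / b) for a, b in zip(p, q) if a > 0)

T, beta = 0.3, 0.1
rho, sigma = sharp_pair(T, beta, d=4, J=1)
trace_norm = sum(abs(a - b) for a, b in zip(rho, sigma))
print(sum(rho), sum(sigma))                       # both should be 1 up to rounding
print(trace_norm, 2 * T)                          # trace norm of rho - sigma vs 2T
print(kl(rho, sigma), (T + beta) * math.log((T + beta) / beta))
```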



The result of the foregoing can be subsumed into the following theorem.

Theorem 6 Let $\Delta = \rho - \sigma$, $T = \|\Delta\|_1/2$ and $\beta = \lambda_{\min}(\sigma)$. If $T \le \beta$ then

$$S(\rho\|\sigma) \le (\beta+T)\log\frac{\beta+T}{\beta} + (\beta-T)\log\frac{\beta-T}{\beta}, \tag{55}$$

and this upper bound is sharp when $d > 2$. If $\beta < T \le 1 - \beta$ then

$$S(\rho\|\sigma) \le (\beta+T)\log\frac{\beta+T}{\beta}. \tag{56}$$

When $d > 2$, this bound is sharp for (at least) $\beta \le T \le 1 - 2\beta$.

Figure 3 illustrates these bounds and shows their superiority to the previously obtained bound (19).
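For reference, the two regimes of Theorem 6 can be combined into a single function. This is a minimal sketch: the name `upper_bound_trace` and the sample values are assumptions, and natural logarithms are used.

```python
import math

def upper_bound_trace(T, beta):
    """Right-hand side of (55) for T <= beta and of (56) for beta < T <= 1 - beta."""
    if not (0 < beta <= 0.5 and 0 <= T <= 1 - beta):
        raise ValueError("need 0 < beta <= 1/2 and 0 <= T <= 1 - beta")
    first = (beta + T) * math.log((beta + T) / beta)
    if T < beta:
        return first + (beta - T) * math.log((beta - T) / beta)
    # At T = beta the second term of (55) vanishes, so (55) and (56) agree there.
    return first

def kl(p, q):
    """Relative entropy of two commuting states given by their spectra."""
    return sum(a * math.log(a / b) for a, b in zip(p, q) if a > 0)

# Case T <= beta in d = 3: Delta = T*F and psi = e_3 attain the bound.
rho, sigma = [0.3, 0.1, 0.6], [0.2, 0.2, 0.6]
print(upper_bound_trace(0.1, 0.2), kl(rho, sigma))
print(upper_bound_trace(0.2, 0.2), 0.4 * math.log(2))  # the two regimes meet at T = beta
```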

Again, it is interesting to look at the bound for large $\beta$. An approximation for bound (55), valid for $T \ll \beta$, is given by

$$S(\rho\|\sigma) \le \sum_{k=1}^{\infty} \frac{T^{2k}}{k(2k-1)\beta^{2k-1}} \approx \frac{T^2}{\beta}. \tag{57}$$
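The series in (57) follows from the same expansion that was used in the proof of Lemma 9. As a short worked derivation (the abbreviation $z := T/\beta$ is introduced here only for convenience):

```latex
\begin{align*}
(\beta+T)\log\frac{\beta+T}{\beta} + (\beta-T)\log\frac{\beta-T}{\beta}
  &= \beta\,\bigl[(1+z)\log(1+z) + (1-z)\log(1-z)\bigr], \qquad z := T/\beta, \\
  &= \beta \sum_{k=1}^{\infty} \frac{z^{2k}}{k(2k-1)}
   = \sum_{k=1}^{\infty} \frac{T^{2k}}{k(2k-1)\,\beta^{2k-1}} \\
  &= \frac{T^2}{\beta} + \frac{T^4}{6\beta^3} + O\!\left(\frac{T^6}{\beta^5}\right).
\end{align*}
```

The series converges for $T < \beta$, and the first correction to $T^2/\beta$ is smaller by a factor $T^2/(6\beta^2)$.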



VIII. APPLICATION TO STATE APPROXIMATION 

In the following paragraph we will give an application of our bounds to state approximation. Consider a state $\rho$ on a Hilbert space $\mathcal{H}$, and a sequence $\{\sigma_n\}_n$ where $\sigma_n$ is a state on $\mathcal{H}^{\otimes n}$. As before, the sequence is said to asymptotically approximate $\rho$ if for $n$ tending to infinity, $\|\sigma_n - \rho^{\otimes n}\|_1 = \mathrm{Tr}|\sigma_n - \rho^{\otimes n}|$ tends to zero. Let us define $T_n$ as

$$T_n := \mathrm{Tr}|\rho^{\otimes n} - \sigma_n|/2.$$

Because of the lower bound of Theorem 4, we get

$$S_n := S(\rho^{\otimes n}\|\sigma_n) \ge s(T_n),$$

and this bound is sharp. Hence, $T_n$ goes to zero if $S_n$ does.

On the other hand, $T_n$ going to zero does not necessarily imply $S_n$ going to zero. Indeed, $S_n$ can be infinite for any finite value of $n$ when $\rho^{\otimes n}$ is not restricted to the range of $\sigma_n$. In particular, the relative entropy distance between two pure states is infinite unless the states are identical. At first sight, this seems to render the relative entropy useless as a distance measure. Nevertheless, sense can be made of it by imposing an additional requirement that the range of $\sigma_n$ must contain the range of $\rho^{\otimes n}$. Let us then restrict $\sigma_n$ to the range of $\rho^{\otimes n}$, as the relative entropy only depends on that part of $\sigma_n$. Letting $d$ be the rank of $\rho$, the dimension of the range of $\rho^{\otimes n}$ is $d^n$. Let $\beta_n$ be the smallest non-zero eigenvalue of $\sigma_n$ on that range; $\beta_n$ is at most $1/d^n$.

The behaviour of the relative entropy then very much depends on the relation between $\beta_n$ and $T_n$. Since $\beta_n$ decreases at least exponentially, we only need to consider the case $T_n > \beta_n$, and use the bound (56):

$$S_n \le (\beta_n + T_n)\log\Big(1 + \frac{T_n}{\beta_n}\Big).$$

In the worst-case behaviour of $T_n$ ($T_n/\beta_n$ tending to infinity) the bound can be approximated by

$$S_n \le T_n\log\frac{T_n}{\beta_n} = T_n(\log T_n - \log\beta_n) \approx T_n|\log\beta_n|.$$

To guarantee convergence of $S_n$ we therefore need $T_n$ to converge to 0 at least as fast as $1/|\log\beta_n|$, which in the best case goes as $1/n$. Note that bound (19) yields the same requirement, but as this bound is not a sharp one it could have been too strong a requirement. This gives us the subsequent theorem.


Theorem 7 Consider a state $\rho$ on a finite-dimensional Hilbert space $\mathcal{H}$ and a sequence $\{\sigma_n\}_n$ of states $\sigma_n$ on $\mathcal{H}^{\otimes n}$. The sequence $\{\sigma_n\}_n$ asymptotically approximates $\rho$ in the trace norm, if

$$\lim_{n\to\infty} S(\rho^{\otimes n}\|\sigma_n) = 0. \tag{58}$$

Conversely, if the range of $\sigma_n$ includes the range of $\rho^{\otimes n}$ and $\|\rho^{\otimes n} - \sigma_n\|_1$ converges to zero faster than $1/|\log\beta_n|$, where $\beta_n$ is the minimal eigenvalue of $\sigma_n$ restricted to the range of $\rho^{\otimes n}$, then $\lim_{n\to\infty} S(\rho^{\otimes n}\|\sigma_n) = 0$.
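The role of the rate condition in Theorem 7 can be illustrated numerically. The sketch below evaluates the right-hand side of bound (56) for the best-case choice $\beta_n = d^{-n}$ and two hypothetical decay rates, $T_n = 1/n$ and $T_n = 1/n^2$; the rank $d = 2$, the values of $n$ and the function name are assumptions. The bound is written in terms of $\log\beta_n$ to avoid overflow in $T_n/\beta_n$.

```python
import math

def bound56_from_log_beta(T, log_beta):
    """(beta + T) * log(1 + T/beta), computed from log(beta) for numerical safety."""
    beta = math.exp(log_beta)  # may underflow to 0.0 for large n, which is harmless here
    return (beta + T) * (math.log(beta + T) - log_beta)

d = 2                      # rank of rho (illustrative)
ns = [10, 100, 1000]
log_betas = [-n * math.log(d) for n in ns]   # best case beta_n = d**(-n)
slow = [bound56_from_log_beta(1 / n, lb) for n, lb in zip(ns, log_betas)]
fast = [bound56_from_log_beta(1 / n**2, lb) for n, lb in zip(ns, log_betas)]
print(slow)  # T_n = 1/n: the bound approaches log(d) instead of 0
print(fast)  # T_n = 1/n**2: the bound decreases towards 0
```

With $T_n = 1/n$, which is only as fast as $1/|\log\beta_n|$, the upper bound stays near $\log d$ and therefore does not force $S_n$ to vanish; with the faster rate $1/n^2$ it does.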
IX. SUMMARY

In this paper, we have discussed several lower and upper bounds on the relative entropy functional, thereby sharpening the notion of continuity of the relative entropy for states which are close to each other in the trace norm sense.

The main results are the sharp lower bound from Theorem 4, and the sharp upper bounds of Theorems 5 ($d = 2$) and 6 ($d > 2$). Theorems 4 and 5 give the relation between relative entropy and norm distances based on any unitarily invariant norm, while Theorem 6 holds only for the trace norm distance. These results have been obtained employing methods from optimisation theory.